perf: specialize Builtin1 and Builtin3 apply paths#807
Merged
stephenamar-db merged 1 commit intodatabricks:masterfrom Apr 30, 2026
Merged
perf: specialize Builtin1 and Builtin3 apply paths#807stephenamar-db merged 1 commit intodatabricks:masterfrom
stephenamar-db merged 1 commit intodatabricks:masterfrom
Conversation
Motivation: Reduce allocation and dispatch overhead when one- and three-argument builtins are called through the dynamic function apply path. Modification: Add Builtin1.apply1 and Builtin3.apply3 overrides that directly call their structured evalRhs methods for exact positional arity, matching the existing Builtin2.apply2 specialization and falling back to the generic parent path otherwise. Result: Dynamic builtin calls avoid constructing temporary argument arrays on the exact-arity path. JVM tests and targeted hyperfine comparisons pass.
He-Pin
added a commit
to He-Pin/sjsonnet
that referenced
this pull request
Apr 30, 2026
Motivation: Reduce allocation overhead in common numeric rendering paths. Modification: 1. RenderUtils.renderDouble reuses pre-cached string representations for exact integer doubles in the range 0-255. 2. Materializer.stringify delegates number stringification to RenderUtils.renderDouble, removing its duplicate integer fast path. Result: Numeric materialization uses the shared renderDouble fast path. The Builtin1.apply1 / Builtin3.apply3 specialization from the original PR is already present in current master via databricks#807, so it is no longer part of this PR diff.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Split out from #763.
Motivation:
Reduce allocation and dispatch overhead when one- and three-argument builtins are called through the dynamic
Val.Func.apply1/apply3path.Key Design Decision:
Keep the optimization local and semantics-preserving.
Builtin2already has an exact-arityapply2override; this adds matchingBuiltin1.apply1andBuiltin3.apply3overrides. Exact positional calls directly invoke the structuredevalRhsoverload and skip constructing an intermediateArray. Non-exact paths still fall back to the generic parent application path.Correctness:
Builtin1.apply/Builtin3.applyexact positional behavior: force the suppliedEvalvalues, then call the typedevalRhs.Expr.ApplyBuiltin1/Expr.ApplyBuiltin3paths are unchanged; this only helps dynamic builtin calls such as a builtin stored in a local or returned from another function.Modification:
Builtin1.apply1.Builtin3.apply3.Validation:
./mill --no-server 'sjsonnet.jvm[3.3.7].compile'./mill --no-server 'sjsonnet.jvm[3.3.7].test'(141/141, SUCCESS)./mill --no-server 'sjsonnet.jvm[3.3.7].checkFormat'./mill --no-server '_.jvm[_].__.test'(1104/1104, SUCCESS)local f = std.length; f([1, 2, 3])->3local f = std.substr; f("abcdef", 1, 3)->"bcd"local f = std.substr; f(str="abcdef", from=1, len=3)->"bcd"Hyperfine:
Toolchain:
hyperfine 1.20.0--warmup 3 --min-runs 25for targeted dynamic builtin benchmarks--warmup 3 --min-runs 20forrealistic2./mill --no-server show 'sjsonnet.jvm[3.3.7].assembly'upstream/masteratc04fc8042067d8b5Targeted
Builtin1dynamic call benchmark:master builtin1_dynamicbranch builtin1_dynamicResult: statistically neutral in this hyperfine run.
Targeted
Builtin3dynamic call benchmark:master builtin3_dynamicbranch builtin3_dynamicResult: branch was faster in this run, but variance is high.
End-to-end
realistic2:master realistic2branch realistic2Result: branch was faster in this run; due JVM-startup and system noise, treat this as a non-regression signal rather than a guaranteed 1.27x speedup.